These libraries are preferred over others for their batch transform capabilities. Being able to apply these transforms on batches allows us to use GPUs and get a speedup of roughly 10-20x, depending on the input image size.
Kornia has the ability to pass a same_on_batch argument. If it is set to False, augmentations are randomized independently for each element of the batch.
jitter: do color jitter or not
bw: do grayscale or not
blur: do blur or not
resize_scale: (min, max) scales for random resized crop
resize_ratio: (min, max) aspect ratios to use for random resized crop
s: scalar for color jitter
blur_s: (min, max) or single int for blur
Their corresponding probabilities: flip_p, jitter_p, bw_p, blur_p
Kornia augmentation implementations have two additional parameters compared to TorchVision, return_transform and same_on_batch. The former provides the ability to undo a geometric transformation, while the latter can be used to control the randomness across a batched transformation. To enable these behaviours, simply set the flags to True.
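A minimal sketch of these two flags using kornia directly (not this library's helpers); the return_transform part assumes an older kornia release where that flag is still available, as it has since been deprecated.

import torch
import kornia.augmentation as K

# 8 identical copies of one random image
img = torch.rand(3, 64, 64)
batch = img[None].repeat(8, 1, 1, 1)

# same_on_batch=False: each copy gets its own random color jitter,
# so the outputs differ across the batch
jitter = K.ColorJitter(0.4, 0.4, 0.4, 0.1, p=1.0, same_on_batch=False)
print(jitter(batch).std(dim=0).mean())        # > 0

# same_on_batch=True: every copy receives exactly the same jitter
jitter_same = K.ColorJitter(0.4, 0.4, 0.4, 0.1, p=1.0, same_on_batch=True)
print(jitter_same(batch).std(dim=0).mean())   # ~0

# return_transform=True (older kornia releases): the augmentation also returns
# the 3x3 matrices that were applied, which can be inverted to undo the warp
aff = K.RandomAffine(degrees=30., p=1.0, return_transform=True)
out, mat = aff(batch)
print(out.shape, mat.shape)                   # (8, 3, 64, 64), (8, 3, 3)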
Recommendation: Even though the defaults work very well on many benchmark datasets, it's always better to try different values and visualize your dataset before going further with training.
Overall, kornia's RandomResizedCrop looks more zoomed in; this might be related to the sampling function used for the scale.
aug, n = get_kornia_batch_augs(336, resize_scale=(0.2,1), stats=imagenet_stats, cuda=False, same_on_batch=False), 5
fig,ax = plt.subplots(n,2,figsize=(8,n*4))
for i in range(n):
    show_image(t1,ax=ax[i][0])
    show_image(aug.decode(aug(t1)).clamp(0,1)[0], ax=ax[i][1])
GPU batch transforms are roughly 10-20x faster than CPU, depending on image size; larger images benefit more from the GPU.
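The cells below use %%timeit; to reproduce the comparison outside a notebook, a small helper along these lines works as well (an illustrative sketch, time_aug is not part of the library). Note the torch.cuda.synchronize() call, without which asynchronous GPU kernels would not be included in the measurement.

import time
import torch

def time_aug(aug, xb, iters=10):
    "Average seconds per call of `aug` on batch `xb`"
    _ = aug(xb)                                  # warmup run
    if torch.cuda.is_available(): torch.cuda.synchronize()
    start = time.perf_counter()
    for _ in range(iters):
        _ = aug(xb)
    if torch.cuda.is_available(): torch.cuda.synchronize()
    return (time.perf_counter() - start) / iters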
xb = (torch.stack([t1]*32))
aug= get_kornia_batch_augs(336, resize_scale=(0.75,1), stats=imagenet_stats, cuda=False)
%%timeit
out = aug(xb)
if torch.cuda.is_available():
    xb = xb.to(default_device())
    aug = get_kornia_batch_augs(336, resize_scale=(0.75,1), stats=imagenet_stats)
%%timeit
if torch.cuda.is_available():
    out = aug(xb) # ignore: GPU warmup
    torch.cuda.synchronize()
Timing with same_on_batch=False:
%%timeit
if torch.cuda.is_available():
    out = aug(xb)
    torch.cuda.synchronize()
if torch.cuda.is_available():
    xb = xb.to(default_device())
    aug = get_kornia_batch_augs(336, resize_scale=(0.75,1), same_on_batch=True, stats=imagenet_stats)
Timing with same_on_batch=True:
%%timeit
if torch.cuda.is_available():
    out = aug(xb)
    torch.cuda.synchronize()
Torchvision doesn't have a same_on_batch parameter, and it also doesn't support jitter_p.
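For reference, plain torchvision usually emulates a per-transform probability like jitter_p by wrapping the transform in RandomApply; this sketch uses stock torchvision classes, not this library's helpers.

import torchvision.transforms as T

# apply the whole ColorJitter block with probability 0.8, mimicking jitter_p
jitter = T.RandomApply([T.ColorJitter(0.4, 0.4, 0.4, 0.1)], p=0.8)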
aug, n = get_torchvision_batch_augs(336, resize_scale=(0.2, 1), stats=imagenet_stats, cuda=False), 5
fig,ax = plt.subplots(n,2,figsize=(8,n*4))
for i in range(n):
    show_image(t1,ax=ax[i][0])
    show_image(aug.decode(aug(t1)).clamp(0,1)[0], ax=ax[i][1])
Torchvision is slightly faster than kornia with same_on_batch=False.
xb = (torch.stack([t1]*32))
aug= get_torchvision_batch_augs(336, resize_scale=(0.75,1), stats=imagenet_stats, cuda=False)
%%timeit
out = aug(xb)
if torch.cuda.is_available():
    xb = xb.to(default_device())
    aug = get_torchvision_batch_augs(336, resize_scale=(0.75,1), stats=imagenet_stats)
%%timeit
if torch.cuda.is_available():
    out = aug(xb)
    torch.cuda.synchronize()
In fastai a few of the batch transforms are named differently, which is why it is not used as the first choice, and there might be implementation differences for better or worse. In general, though, fastai has a faster and more accurate batch transform pipeline thanks to a composition function called setup_aug_tfms.
Here, max_lighting is used for the color jitter magnitude.
Fastai is as fast as the combination of kornia and torchvision, but it should be noted that RandomResizedCropGPU applies the same crop to all elements of the batch (which is probably fine), similar to torchvision, and color jittering is implemented as 4 separate transforms.
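To make this concrete, here is a hedged sketch of the stock fastai pieces being referred to: one GPU random resized crop plus four separate colour transforms, combined with setup_aug_tfms (which merges compatible transforms where it can). These are plain fastai classes, not this library's get_fastai_batch_augs helper.

from fastai.vision.all import (RandomResizedCropGPU, Brightness, Contrast,
                               Saturation, Hue, setup_aug_tfms)

tfms = setup_aug_tfms([
    RandomResizedCropGPU(336, min_scale=0.2),     # one crop shared across the batch
    Brightness(max_lighting=0.3),                 # colour jitter split into
    Contrast(max_lighting=0.3),                   # four separate transforms
    Saturation(max_lighting=0.3),
    Hue(max_hue=0.1),
])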
aug, n = get_fastai_batch_augs(336, min_scale=0.2, stats=imagenet_stats, cuda=False), 5
fig,ax = plt.subplots(n,2,figsize=(8,n*4))
for i in range(n):
    show_image(t1,ax=ax[i][0])
    show_image(aug.decode(aug(t1)).clamp(0,1)[0], ax=ax[i][1])
xb = (torch.stack([t1]*32))
aug = get_fastai_batch_augs(336, min_scale=0.75, stats=imagenet_stats, cuda=False)
%%timeit
out = aug(xb)
if torch.cuda.is_available():
    xb = xb.to(default_device())
    aug = get_fastai_batch_augs(336, min_scale=0.75, stats=imagenet_stats)
%%timeit
if torch.cuda.is_available():
    out = aug(xb)
    torch.cuda.synchronize()
Here we use RandomResizedCrop from torchvision and keep the remaining augmentations the same as kornia. This is the best of both worlds: fast and diverse augmentations. Also, Rotate from fastai is used with reflection padding.
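For reference, Rotate is a stock fastai transform and its padding behaviour is controlled by the pad_mode argument; this is a plain fastai snippet, not this library's code (PadMode.Reflection also happens to be the default).

from fastai.vision.all import Rotate, PadMode

# rotate by up to 10 degrees with probability 0.5, filling borders by reflection
rot = Rotate(max_deg=10, p=0.5, pad_mode=PadMode.Reflection)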
aug, n = get_batch_augs(336, resize_scale=(0.2, 1), stats=imagenet_stats,cuda=False), 5
fig,ax = plt.subplots(n,2,figsize=(8,n*4))
for i in range(n):
    show_image(t1,ax=ax[i][0])
    show_image(aug.decode(aug(t1)).clamp(0,1)[0], ax=ax[i][1])
Torchvision is slightly faster than kornia with same_on_batch=True.
xb = (torch.stack([t1]*32))
aug = get_batch_augs(336, resize_scale=(0.75,1), stats=imagenet_stats, cuda=False)
%%timeit
out = aug(xb)
if torch.cuda.is_available():
    xb = xb.to(default_device())
    aug = get_batch_augs(336, resize_scale=(0.75,1), stats=imagenet_stats)
%%timeit
if torch.cuda.is_available():
    out = aug(xb)
    torch.cuda.synchronize()
You can simply add any extra batch transform by passing it in a list to xtra_tfms.
aug, n = get_batch_augs(336, resize_scale=(0.2, 1), stats=imagenet_stats, cuda=False, xtra_tfms=[RandomErasing(p=1.)]), 5
fig,ax = plt.subplots(n,2,figsize=(8,n*4))
for i in range(n):
    show_image(t1,ax=ax[i][0])
    show_image(aug.decode(aug(t1)).clamp(0,1)[0], ax=ax[i][1])
# create 2 augmentation pipelines at size 224 and run a basic sanity check on them
augs = get_multi_aug_pipelines(n=2, size=224)
assert_aug_pipelines(augs)